
    Mental distress detection and triage in forum posts: the LT3 CLPsych 2016 shared task system

    This paper describes the contribution of LT3 to the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVMs), cascaded binary SVMs, and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all features score best in terms of F-score, whereas feature filtering with bi-normal separation and classifier ensembling are found to improve recall of alarming posts.

    Economic event detection in company-specific news text

    This paper presents a dataset and a supervised classification approach for economic event detection in English news articles. Currently, the economic domain lacks resources and methods for data-driven supervised event detection. The detection task is conceived as a sentence-level classification task for 10 different economic event types. Two machine learning approaches were tested: a rich feature set Support Vector Machine (SVM) set-up and a word-vector-based long short-term memory recurrent neural network (RNN-LSTM) set-up. We show satisfactory results for most event types, with the linear kernel SVM outperforming the other experimental set-ups.

    SENTiVENT Event Annotation Guidelines v1.1

    Annotation guidelines for economic events in the SENTiVENT project for economic news text mining. The goal of this annotation scheme is to produce a gold-standard labeled dataset that enables supervised event extraction in the company-specific news text domain. The guidelines are based on the Rich ERE Guidelines for Events and Argument Fillers, but are adapted to a corpus of business and financial news articles. We exclusively annotate event structures, unlike Rich ERE, which annotates Entities and Relations separately.

    Extracting fine-grained economic events from business news

    Based on a recently developed fine-grained event extraction dataset for the economic domain, we present a pilot study on supervised economic event extraction. We investigate how a state-of-the-art model for event extraction performs on trigger and argument identification and classification. While F1-scores above 50% are obtained on the task of trigger identification, we observe a large gap in performance compared to results on the benchmark ACE05 dataset. We show that single-token triggers do not provide sufficient discriminative information for a fine-grained event detection setup in a closed domain such as economics, since many classes have a large degree of lexico-semantic and contextual overlap.

    Production of human recombinant proapolipoprotein A-I in Escherichia coli: purification and biochemical characterization

    A human liver cDNA library was used to isolate a clone coding for apolipoprotein A-I (Apo A-I). The clone carries the sequence for the prepeptide (18 amino acids), the propeptide (6 amino acids), and the mature protein (243 amino acids). A coding cassette for the proapo A-I molecule was reconstructed by fusing synthetic sequences, chosen to optimize expression and specifying the amino-terminal methionine and amino acids -6 to +14, to a large fragment of the cDNA coding for amino acids 15-243. The module was expressed in pOTS-Nco, an Escherichia coli expression vector carrying the regulatable λ PL promoter, leading to the production of proapolipoprotein A-I at up to 10% of total soluble proteins. The recombinant polypeptide was purified and characterized in terms of apparent molecular mass and isoelectric point, and by both chemical and enzymatic peptide mapping. In addition, it was assayed in vitro for the stimulation of the enzyme lecithin:cholesterol acyltransferase. The data show for the first time that proapo A-I can be produced efficiently in E. coli as a stable and undegraded protein having physical and functional properties indistinguishable from those of the natural product.

    Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity

    The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) evaluate many publicly available resources for this task and demonstrate difficulties with data collection. These predominantly yield small datasets that fail to capture the required complex social dynamics and impede direct comparison of progress. We (ii) conduct an extensive set of experiments that indicate a general lack of cross-domain generalization of classifiers trained on these sources, and openly provide this framework to replicate and extend our evaluation criteria. Finally, we (iii) present an effective crowdsourcing method: simulating real-life bullying scenarios in a lab setting generates plausible data that can be effectively used to enrich real data. This largely circumvents the restrictions on data that can be collected, and increases classifier performance. We believe these contributions can aid in improving the empirical practices of future research in the field.